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Abstract 

The number of fixed mutations accumulated in an evolving population often displays a variance 
that is significantly larger than the mean (the overdispersed molecular clock). By examining a 
generic evolutionary process on a neutral network of high-fitness genotypes, we establish a formal- 
ism for computing all cumulants of the full probability distribution of accumulated mutations in 
terms of graph properties of the neutral network, and use the formalism to prove over dispersion of 
the molecular clock. We further show that significant over dispersion arises naturally in evolution 
when the neutral network is highly sparse, exhibits large global fluctuations in neutrality, and 
small local fluctuations in neutrality. The results are also relevant for elucidating the topological 
structure of a neutral network from empirical measurements of the substitution process. 
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Introduction. - The neutral theory of molecular evolution jl| posits that most sequence 
substitutions at the nucleic acid or protein level are selectively neutral and do not appreciably 
alter the activity of the molecule in which they occur or the fitness of the host organism. It 
predicts that the number of substitutions accumulated in an evolving population of sequences 
in time t follows a Poisson distribution with mean fiut, where fi is the mutation rate per 
sequence per generation and u is the average fraction of neutral mutations (also called the 



neutrality). This prediction gives a simple explanation to the "molecular clock" [2j] - the 
idea that the number of accumulated fixed mutations in a population is proportional to the 
time elapsed - and implies that the variance in this number must equal its mean, leading to 
an index of dispersion (defined as the variance divided by the mean) of 1. 

However, experimental studies often find that the index of dispersion is significantly larger 
than 1 (the overdispersed molecular clock) This finding can be reconciled with the 

neutral theory by assuming that the space of neutral sequences has fluctuating neutrality 3] , 
causing the substitution process to be non-Poissonian, as verified by computer simulations 
By , lid] that show significant over dispersion when the product Nfj, of the population size 
N and the mutation rate fi is much smaller than 1. 

There is limited theoretical understanding of the nature of the molecular clock. Cutler 



ll| formally calculated the index of dispersion in terms of statistics of the mutation and 



fixation processes and argued that slow fluctuations in evolutionary parameters could lead 
to significant overdispersion in simple evolutionary models. Recent analytical results include 
a derivation of the index of dispersion for neutrally evolving protein populations constrained 
by a stability requirement 12| . These results do not conclusively prove overdispersion of the 
molecular clock in a sufficiently general scenario, nor do they give an explicit characterization 
of the non-Poissonian nature of the full probability distribution of accumulated mutations. 
A natural stage for fluctuating neutrality is presented by a neutral network 
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, ll7|, ll8[ of high- and equal-fitness genotypes in which two genotypes are linked by 
an edge if they differ by a single point mutation. The aim of this Letter is to theoretically 
clarify the non-Poissonian nature of the distribution of accumulated fixed mutations, to 
relate all cumulants of this distribution to graph invariants of the neutral network, to prove 
overdispersion of the molecular clock, and to identify features of the neutral network that 
could lead to significant overdispersion. We assume Nfi <^ 1, as is relevant for the majority 
of organisms in the plant and animal kingdom lOj. For this limit to be valid, it is also 
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necessary that /i <^ 1, as we assume below. We further assume that the neutral network is 
a connected graph; if it is not connected, the results below apply separately to populations 
evolving on each connected component of the neutral network graph. 

Substitution process when Nfi ^ 1. - Consider a population of individuals evolving on 
a neutral network, represented by a graph (5 with n nodes, E edges, and adjacency matrix G. 
The nodes of C5 represent high- fitness genotypes characterized by sequences of length L over 
an alphabet of size A. Two nodes are connected by an edge if the corresponding genotypes 
differ by a single point mutation. The neutrality of a node r in C5 is d,./ {L{A — 1)), where dr 
is the degree of r in 0, and represents the fraction of point mutations of the genotype r that 



are neutral. Following [17|], we consider a discrete mutation-selection dynamics in which at 
each generation an individual suffers a point mutation with fixed probability fi that moves 
it to a neighboring genotype (which may or may not be of high fitness). individuals are 
then selected with replacement from the mutated population with probability proportional 
to their fitness, and the process is repeated. For <^ 1, the population at any point in 
time is converged on a single node of the neutral network [17|. At each generation it either 
stays at its current node or moves effectively as a single entity to a neighboring node. The 
probability pt{r) that the population is on node r at time t is governed by the equation 



= (I-/iD + /iG)pt_i = (I-/iL)pt_i, (1) 

where pt{r) is the rth element of p^, I is the n x n identity matrix, fi = fi/{L{A — 1)) is 
the reduced mutation rate, D is a diagonal matrix with node degrees on the main diagonal, 
and L = D — G is the graph Laplacian of <3. The term I — /ID represents the probability 
that the population stays at its current node (either due to no mutation or a deleterious 
mutation that is culled by selection), and the term flG represents the probability that the 
population moves to a neighboring node. 

L is a symmetric, positive semi-definite matrix and, if & is connected, has exactly one 
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20]. We denote eigenvalues of L by 



zero eigenvalue and all other eigenvalues positive 

< < A2 < . . . < A„_i, with Ao = 0. Further, because Ylij-^ij — — 0; 

eigenvector of L corresponding to Aq is proportional to 1, the column vector with all entries 
equal to 1. The properly normalized limiting distribution over is lim p^ = n~^l, i.e., all 



nodes are occupied with equal probability |17 |. 



Consider the joint distribution pt{r,m), representing the probabihty that the population 



is on node r at time t and m neutral substitutions have accumulated since time (see 12 1 
for a similar representation). The dynamics of the joint process is 

Pt(m) = (I - /iD) Pf_i(m) + /iGpt_i(m - 1), (2) 

where the rth element of Pt{m) is pt{r,m). Assuming an equilibrated population at t = 0, 
the initial condition for Eq. ([2]) is po(^) = Sm,on~^l- 

To solve ([2]), it is convenient to define the vector moment generating function (mgf) 

oo 

Qt(^) = e™'^pt(m). Noting that qo(6') = n~^l, multiplying both sides of Eq. by e"*^, 

771=0 

summing over all m, and finally solving the resulting equation yields 

qt(^) =n-i (l-/iL + /i(e'- 1)G)*1. (3) 

The mgf qt{6) for the distribution of accumulated mutations is found by marginalizing over 
the vector mgf: qt{0) = J2r ^) ~ l^Qt(^)) where the superscript T denotes the transpose 
operation. This yields 

qtie) = (I - /iL + il{e' - 1)G)* 1. (4) 

The probability pt{fn) that m mutations have accumulated in time t may be recovered as the 
coefficient of e™^ in the above mgf. Moments of pt(m) are obtained in the usual manner by 
taking multiple derivatives of Eq. (jlj) with respect to 6. This procedure, however, becomes 
increasingly cumbersome for the calculation of higher moments, primarily because L and G 
do not, in general, commute. We therefore directly consider the late time and small fi limit 
of the mgf qt{0) below. 

Late time behavior and cumulants. - Consider a time scale long enough so that a suffi- 
ciently large number of mutations have accumulated in the population, i.e., t ^ jl^^. It is 
then convenient to measure time in units of define a rescaled time variable t = jit, and 
examine Eq. (jl]) in the limit t ^ 1 and /i ^ 1. Equation (jl]) may be rewritten as 

q,{e) = n-'l^ (I - + /i(e^ - 1)G)*/^ 1, (5) 
^l^exp [t((e'-l)G-L)] 1, (6) 



where we have used /x <^ 1 in making the approximation above. We now introduce the 
spectral expansion 

n-l 

(e^ _ 1)G - L = ^ A,(e)u«(e)u«^(e), (7) 



i=0 



where {u'^*)(^)} is an orthonormal basis of eigenvectors of (e^ — 1)G— L with eigenvalues Xi(9) 
ordered in decreasing order. Note that Ai(0) = — Aj (eigenvalues of — L), and in particular, 
Ao(0) = and u'^°^(0) = n^^/^l. For large i, Eq. ([6]) is dominated by the leading term in 
Eq. ([7]), corresponding to the largest eigenvalue Ao(^^). Thus, in the late time limit, the 
cumulant generating function In qt{6) and associated cumulants {k(^j)} are given by 

\nq,{e) ^ iXoie), %) ^ t_Ao(^)|^.=o. (8) 

Since Ao(^^) only depends on the topology of the neutral network graph, it follows that the 
ratio of any 2 cumulants depends only on the topology of the neutral network graph, and 
not on fl and t, at late times. 

To obtain explicit formulae for the cumulants, we need to find Xo{G) to any desired order 
in powers of 6. This is carried out in a recursive manner: expand Xo{9) and uW(e) m power 
series in 9, 

oo oo 

Xo{9) = J2xil^9^, uW(0) = 5^u(°'^V, (9) 

j=0 j=0 

substitute these expansions in the eigenvalue equation 



[(e^ - 1)G - L] u^°\9) = Ao(e)u(°)(^), 



(10) 



and compare the coefficients of equal powers of 9 on both sides of the above equation. Noting 
that Aq°^ = 0, comparison of coefficients of 9'^ on both sides of Eq. (fTOj) yields u*^°'°^ = n^^/^l, 
and for j > 0, 



E 

1=1 



G 



L(j-0! 



A 



u 



(0,0 



(11) 



where d is a column vector containing node degrees (the main diagonal of D), and it is 
understood that the sum on the right hand side vanishes for j = 1. Multiplying both sides 
of Eq. ( fTTj) by 1^ and using l^L = 0, one obtains, for j > 0, 



A 



(i) 



d 



+ 



1 

-Y 

1=1 



AT_xij-l)-i_T 



U 



(0,0 



(12) 



.{j-iy 

where d = d^ = 2n~^E is the average degree of 0. Equation fll2l) recursively 

in terms of Aq'^ 



expresses A[/^ in terms of Aq'^'' and u^'^''^^ for k < j. To find u^^''^^ one may consider inverting 
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L in Eq. (fTT]) . However L, since it has a zero eigenvalue, has no inverse. We therefore 
introduce a pseudo- inverse of L, denoted L"*", and defined by the spectral expansion 



n-l 



L+ = 5^Ariu«(0)u«^(0). (13) 

1=1 

Note that we have omitted the zero eigenvalue in carrying out the inversion. is a positive 
semi-definite symmetric matrix with 1"^L+ = L+1 = 0. Equation flTT]) may now be solved by 
writing u(°'J') as L+ multiplying the right hand side plus an arbitrary vector in the null space 
of L. However, since L has only a single zero eigenvalue, this null space is 1-dimensional. 
Further, 1 lies in this null space; the null space is therefore spanned by 1, and the solution 
to Eq. is 



u 



^(|:u(«-'.v«^«),. (14) 



where the coefficient multiplying 1 above is found, after some algebra, by expanding the 
normalization condition u^^\9)'^ u^^\6) = 1 in powers of 9. 

Equations (fT2l) and (fT4l) . together with the starting conditions Ag^'' = and u'^^'''^ = 
?T,~^/^1, are coupled nonlinear equations that allow one to recursively find Aq"''' (and therefore 
/c(j)) for all j. For example, using these equations and noting that k(_j) = tj!Ao \ the first 
three cumulants at late times are obtained as 

k(i) =td, (15) 
— 2t 

fc(2) =td + — d^L+d, (16) 



k(3) =td H 

' n 



d^L+d - 2dd^L+^d + d^L+GL+d 



Since a Poisson distribution with the same mean has all cumulants equal to td (obtained from 
the first term in Eq. (fT2l) ). Eq. (fT2l) shows systematic departures from Poissonian behavior at 
all cumulant orders in a manner that depends purely on the topology of the neutral network 
graph. Further, since for large t, the Poisson distribution may be well approximated by 
a Normal distribution, the cumulants may be used to develop an Edgeworth expansion of 
Pt{m) around a Normal distribution to any desired accuracy. Fitting this distribution to an 
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empirically obtained ptim) distribution should then yield finer aspects of the topo 
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ogy of 



2l|. 



the neutral network than is accessible from mutational robustness studies alone 

Overdispersion of the molecular clock- Since the first cumulant is the mean and the 
second cumulant the variance, the index of dispersion R may be found as the ratio of /c(2) 
and /c(i) from Eqs. (HM and (ITB]) : 

R=l + id^L+d. (17) 

nd 

Because d^L"''d is a quadratic form associated with a positive semi-definite matrix 22], this 
shows that the molecular clock is generically overdispersed (-R > 1). -R = 1 only if the 
neutral network graph is regular because for a regular graph (and only a regular graph), 
d oc 1 lies in the null space of L and L+. Using Eq. ([6]), it is in fact trivial to show that the 
substitution process is strictly Poissonian for regular neutral network graphs, since G and L 
commute for regular graphs. For all other graphs, d will have a component orthogonal to the 
null space of L and thus result in overdispersion. This is consistent with having "fiuctuating 
neutrality", i.e., unequal neutrality across the network, for overdispersion. To examine how 
the extent of overdispersion depends on neutrality fiuctuations and other graph parameters, 
we now determine bounds on R. 

Using the spectral expansion (IT^ . we obtain 

n-1 

d^L+d = J2 (d^u«(0))' , (18) 

i=l 

n-1 

< Ar' Yl (dV^Ho))' = nAr'Var(ci), (19) 

i=l 

where we have used the fact that Ai is the second-smallest eigenvalue of L and that {u(*)(0)} 
is an orthonormal basis of eigenvectors. Var((i) denotes the variance of the degree distribu- 
tion of the graph. Noting that 2/{nd) = 1/E, this results in an upper bound on the extent 
of overdispersion: 

i?- 1 < (^) A^Var(ci). (20) 

Thus the index of dispersion is bounded from above by an interesting combination of graph 
parameters: the sparseness (as measured by the ratio E/n), the fiuctuations in neutrality 
(as measured by Var((i)), and A]~^, which has a number of interpretations. Ai is the algebraic 
connectivity of the graph 2^ and measures its overall compactness and connectivity. Also, 
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X^^ is the time scale (as measured in units of of relaxation of the distribution pj (Eq. 
([I])) to its equilibrium value n~^l. Therefore, for a fixed amount of neutrality fluctuation, 
R can be large if the neutral network is sparse (high n/E) and less well connected, or 
equivalently, if the network is sparse and the relaxation time scale is large. Since both of 
these conditions are expected to hold quite generally for large and sparse neutral networks, 
Eq. fl20l) is a weak upper bound. It is more interesting to examine the following lower bound 
on R. Returning to Eq. ( |T8l) . and using the fact that /(Aj) = is a convex function of Aj 
for positive Aj, we may apply Jensen's inequality for convex functions: 



to Eq. ( ITSl) with the choice = (d-^u*^*''(0))^. This yields a lower bound on R, namely 

^2 Var(rf)2 



R-l> 



^Er=i A.(d^u»(o))^ 



where we have used Er=i (d^u(*)(0)) = d^Ld = (1/2) ^. . dj {di - djY . Noting that 
the denominator measures the variation in the degree between neighboring nodes on the 
neutral network, we may define, analogous to Var((i), the local variation in neutrality 
LVar((i) = {'^E)^'^^^ -Gij {di — djf' , where the normalizing factor of E appears because 
the sum is a sum over the edge set of the graph, and the factor of 2 prevents double count- 
ing of edges. We therefore get the lower bound 

Thus, although fluctuating neutrality is an essential component of over dispersion within 
the neutral evolution framework, the extent of overdispersion further increases if the graph 
is more sparse and has smaller local variation in neutrality, i.e., smaller fluctuations in 
neutrality across neighboring nodes (the latter requirement was first suggested in a different 



form by Cutler 



ll|). Significantly large overdispersion is then easily realized in, say, a sparse 



neutral network with large diameter in which large global fluctuation in neutrality (degree) 
occurs as a cumulative effect of small local fluctuations in neutrality. 
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